A reinforcement learning agent has been developed to determine optimal operating policies in a multi-part serial line. The agent interacts with a discrete event simulation model of a stochastic production facility. This study identifies issues important to the simulation developer who wishes to optimise a complex simulation or develop a robust operating policy. It also investigates the parameters critical to 'tuning' an agent quickly so that it learns the system rapidly.